Parametric trajectory mixtures for LVCSR
Abstract
Parametric trajectory models explicitly represent the temporal evolution of the speech features as a Gaussian process with time-varying parameters. HMMs are a special case of such models, one in which the trajectory constraints within a speech segment are ignored because frames within the segment are assumed to be conditionally independent. In this paper, we investigate in detail some extensions to our trajectory modeling approach aimed at improving LVCSR performance: (i) improved modeling of mixtures of trajectories via better initialization, (ii) modeling of context dependence, and (iii) improved segment boundaries by means of search. We present results in terms of both phone classification and recognition accuracy on the Switchboard corpus.
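As a rough illustration of the idea in the abstract, the sketch below fits a single polynomial mean trajectory to one speech segment and scores the segment under a Gaussian with that time-varying mean. It is not the authors' implementation: the polynomial form, the time normalization to [0, 1], the diagonal residual variance, and the function names (design_matrix, fit_trajectory, log_likelihood) are all assumptions made for this example. With order 0 the mean is constant across the segment, which corresponds to the frame-independent special case the abstract attributes to HMMs.

```python
import numpy as np


def design_matrix(num_frames, order):
    """Polynomial design matrix over normalized time in [0, 1] (assumed form)."""
    t = np.zeros(1) if num_frames == 1 else np.linspace(0.0, 1.0, num_frames)
    # Columns are 1, t, t^2, ..., t^order.
    return np.vander(t, order + 1, increasing=True)


def fit_trajectory(features, order):
    """Least-squares fit of a polynomial mean trajectory to one segment.

    features: array of shape (num_frames, feature_dim).
    Returns the coefficients (order + 1, feature_dim) and the per-dimension
    residual variance (a diagonal covariance, assumed here for simplicity).
    """
    z = design_matrix(features.shape[0], order)
    coeffs, *_ = np.linalg.lstsq(z, features, rcond=None)
    residual = features - z @ coeffs
    variance = (residual ** 2).mean(axis=0)
    return coeffs, variance


def log_likelihood(features, coeffs, variance):
    """Gaussian log-likelihood of a segment given a mean trajectory."""
    z = design_matrix(features.shape[0], coeffs.shape[0] - 1)
    residual = features - z @ coeffs
    terms = np.log(2.0 * np.pi * variance) + residual ** 2 / variance
    return float(-0.5 * terms.sum())


if __name__ == "__main__":
    # Synthetic 20-frame, 2-dimensional segment with a curved mean (toy data).
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 20)
    segment = np.stack([1.0 + 2.0 * t, t ** 2], axis=1)
    segment += 0.1 * rng.standard_normal(segment.shape)
    for order in (0, 2):
        coeffs, variance = fit_trajectory(segment, order)
        print(order, log_likelihood(segment, coeffs, variance))
```

A mixture of trajectories, as in extension (i), would hold several such coefficient sets per phone and weight their likelihoods; that step is omitted here because the abstract does not specify how the mixtures are initialized or trained.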
Related articles
Introduce Segmental Inner Time Warping into Parametric Trajectory Segment Model for LVCSR
In this paper, a parametric trajectory segment model (PTSM) with segmental inner time warping is proposed to improve the recognition accuracy of large vocabulary continuous speech recognition (LVCSR). The proposed PTSM uses the state boundary information provided by an HMM system during decoding to perform segmental inner time warping. Good alignment between different length realizations of a same p...
Trajectory Analysis Using Switched Motion Fields: A Parametric Approach
This paper presents a new model for trajectories in video sequences using mixtures of motion fields. Each field is described by a simple parametric model with only a few parameters. We show that, despite the simplicity of the motion fields, the overall model is able to generate the complex trajectories occurring in video analysis.
IPA Japanese Dictation Free Software Project
Large vocabulary continuous speech recognition (LVCSR) is an important basis for the application development of speech recognition technology. We have constructed a common Japanese LVCSR speech database and have been developing sharable Japanese LVCSR programs and models through volunteer-based efforts. We have been engaged in the following two volunteer-based activities. a) IPSJ (Information Processin...
Strategies for high accuracy keyword detection in noisy channels
We present design strategies for a keyword spotting (KWS) system that operates in highly degraded channel conditions with very low signal-to-noise ratio levels. We employ a system combination approach by combining the outputs of multiple large vocabulary automatic speech recognition (LVCSR) systems, each of which employs a different system design approach targeting three different levels of inf...
Effective vector quantization for a highly compact acoustic model for LVCSR
This paper introduces a method that can efficiently reduce acoustic model size and computation for LVCSR based on the continuous-density hidden Markov model (CDHMM). The method uses the Bhattacharyya distance measure as a criterion to quantize the mean and variance vectors of the Gaussian mixtures. To minimize the quantization error, the feature vector was separated into multiple streams (such as MFCCs, del...